Classification of microarrays to nearest centroids
Abstract
MOTIVATION: Classification of biological samples by microarrays is a topic of much interest. A number of methods have been proposed and successfully applied to this problem. It has recently been shown that classification by nearest centroids provides an accurate predictor that may outperform much more complicated methods. The 'Prediction Analysis of Microarrays' (PAM) approach is one such example, which its authors motivate strongly by its simplicity and interpretability. In this spirit, I seek to assess the performance of classifiers even simpler than PAM.
RESULTS: Surprisingly, I show that the modified t-statistics and shrunken centroids employed by PAM tend to increase misclassification error when compared with their simpler counterparts. Based on these observations, I propose a classification method called 'Classification to Nearest Centroids' (ClaNC). ClaNC ranks genes by standard t-statistics, does not shrink centroids and uses a class-specific gene-selection procedure. Because of these modifications, ClaNC is arguably simpler and easier to interpret than PAM, and it can be viewed as a traditional nearest centroid classifier that uses specially selected genes. I demonstrate that, for a given number of active genes, ClaNC error rates tend to be significantly lower than those for PAM.
AVAILABILITY: Point-and-click software is freely available at http://students.washington.edu/adabney/clanc.
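The ingredients named in the abstract (standard t-statistics, unshrunken centroids, per-class gene selection, nearest-centroid assignment) can be illustrated with a minimal sketch. This is not the ClaNC implementation: the function names, the one-class-versus-overall form of the t-statistic, the pooled within-class standard deviation, and the "top genes per class, then take the union" selection rule are all assumptions made for illustration, and the published procedure may differ in each of these details.

```python
import math
from statistics import mean


def class_centroids(X, y):
    """Per-class mean of every gene. X is a list of samples (lists of gene values)."""
    n_genes = len(X[0])
    return {
        c: [mean(x[g] for x, lab in zip(X, y) if lab == c) for g in range(n_genes)]
        for c in sorted(set(y))
    }


def pooled_sd(X, y, centroids):
    """Pooled within-class standard deviation of every gene (assumed non-zero)."""
    n, k = len(X), len(centroids)
    return [
        math.sqrt(sum((x[g] - centroids[lab][g]) ** 2 for x, lab in zip(X, y)) / (n - k))
        for g in range(len(X[0]))
    ]


def t_statistics(X, y, centroids, sds):
    """Unmodified t-statistic of each class centroid against the overall centroid."""
    n = len(X)
    overall = [mean(x[g] for x in X) for g in range(len(X[0]))]
    t = {}
    for c, cen in centroids.items():
        n_c = sum(1 for lab in y if lab == c)
        scale = math.sqrt(1.0 / n_c - 1.0 / n)  # std. error factor of (class mean - overall mean)
        t[c] = [(cen[g] - overall[g]) / (scale * sds[g]) for g in range(len(sds))]
    return t


def fit(X, y, genes_per_class):
    """Select genes class by class (hypothetical rule) and keep unshrunken centroids."""
    centroids = class_centroids(X, y)
    sds = pooled_sd(X, y, centroids)
    t = t_statistics(X, y, centroids, sds)
    active = set()
    for c in centroids:
        ranked = sorted(range(len(sds)), key=lambda g: abs(t[c][g]), reverse=True)
        active.update(ranked[:genes_per_class])
    return {"centroids": centroids, "sds": sds, "active": sorted(active)}


def predict(model, x):
    """Assign x to the class with the nearest centroid over the active genes."""
    def dist(c):
        cen = model["centroids"][c]
        return sum(((x[g] - cen[g]) / model["sds"][g]) ** 2 for g in model["active"])
    return min(model["centroids"], key=dist)
```

With a fitted model, `predict(model, sample)` returns a class label. No shrinkage threshold appears anywhere in the sketch, so the only tuning parameter is `genes_per_class`.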
Similar resources
ClaNC: point-and-click software for classifying microarrays to nearest centroids
SUMMARY ClaNC (classification to nearest centroids) is a simple and accurate method for classifying microarrays. This document introduces a point-and-click interface to the ClaNC methodology. The software is available as an R package. AVAILABILITY ClaNC is freely available from http://students.washington.edu/adabney/clanc
BIOINFORMATICS Classification of Microarrays to Nearest Centroids
Motivation: Classification of biological samples by microarrays is a topic of much interest. A number of methods have been proposed and successfully applied to this problem. It has recently been shown that classification by nearest centroids provides an accurate predictor that may outperform much more complicated methods. The “Prediction Analysis of Microarrays” (PAM) approach is one such examp...
Optimality Driven Nearest Centroid Classification from Genomic Data
Nearest-centroid classifiers have recently been successfully employed in high-dimensional applications, such as in genomics. A necessary step when building a classifier for high-dimensional data is feature selection. Feature selection is frequently carried out by computing univariate scores for each feature individually, without consideration for how a subset of features performs as a whole. We...
Class prediction by nearest shrunken centroids, with applications to DNA microarrays
We propose a new method for class prediction in DNA microarray studies, based on an enhancement of the nearest prototype classifier. Our technique uses “shrunken” centroids as prototypes for each class and identifies the subsets of the genes that best characterize each class. The method is general, and can be used in other high-dimensional classification problems. The method is illustrated on data...
Regularized Discriminant Analysis and Its Application in Microarrays
In this paper, we introduce a modified version of linear discriminant analysis, called “shrunken centroids regularized discriminant analysis” (SCRDA). This method generalizes the idea of “nearest shrunken centroids” (NSC) [Tibshirani et al., 2003] into the classical discriminant analysis. The SCRDA method is specially designed for classification problems in high dimension low sample size situat...
Journal title:
- Bioinformatics
Volume 21, Issue 22
Pages -
Publication date: 2005